Abstract

This paper introduces the new data-dependent multiplier bootstrap for non-parametric analysis of survival data, possibly subject to competing risks. The new resampling procedure includes both the general wild bootstrap and the weird bootstrap as special cases. The data may be subject to independent right-censoring and left-truncation. We rigorously prove asymptotic correctness, which in particular has been pending for the weird bootstrap. As a consequence, pointwise as well as time-simultaneous inference procedures for, amongst others, the classical survival setting are deduced. We report simulation results and a real data analysis of the cumulative cardiovascular event probability. The simulation results suggest that both the weird bootstrap and the use of non-standard multipliers in the wild bootstrap may perform preferably.
1 Introduction


Non-parametric inference for time-to-event data is often hindered by asymptotic non-pivotality. The Nelson-Aalen (NAE) and Aalen-Johansen estimators (AJE) converge weakly to Gaussian processes on a Skorohod space, but their covariance functions depend on unknown quantities. This problem is typically addressed by plugging in estimates and, in the absence of competing risks, by transforming the limit distribution towards a Brownian bridge (e.g., Andersen et al. 1993, Section IV.1.3.2). The latter approach, however, fails if interest lies in the cumulative event probability of a competing risk, and resampling techniques are needed. Moreover, even if asymptotically pivotal approximations are available, resampling is well known to often perform better in small samples.

In an i.i.d. setting, resampling typically uses the classical bootstrap of Efron (1979), extended to the Kaplan-Meier estimator by repeatedly taking random samples with replacement from the randomly censored observations (Efron 1981). Theoretical justifications were provided by Akritas (1986) and Lo and Singh (1986). The latter authors also suggested that their method of proof extends to the situation of competing risks. In an article about weak convergence for quantile processes and their bootstrap versions, Doss and Gill (1992) briefly discussed resampling inference for the latent failure time of a competing risk.

Another popular resampling method traces back to Lin (1997); see also the textbook treatment by Martinussen and Scheike (2006). Lin's idea was to consider the martingale representation of the AJE, which originates from the Doob-Meyer decomposition for counting processes and does not necessarily require an i.i.d. setup (e.g., Andersen et al. 1993). Lin suggested replacing the martingale increments by the increments of the observed counting processes, reweighted by standard normal variates. The approach has recently been recognized as a special case of the wild bootstrap (e.g., Beyersmann et al. 2013), where the weights are required to have mean zero and variance one, but need not be normal.

Yet another, earlier suggestion for a general resampling procedure for time-to-event data is the weird bootstrap due to Andersen et al. (1993, Section IV.1.4), but it has slightly fallen into oblivion. Andersen et al. formulated their ideas for approximating the empirical distribution of the standardized NAE, say $V_n = \sqrt{n}(\hat A - A)$. Since the counting process increments $dN(t)$ that enter the NAE $\hat A(t)$ have the same conditional variance as $B(Y(t), dA(t))$-distributed binomial random variables, given the risk set $Y(t)$ at $t-$, it seems natural to consider a corresponding weird jump process $N^*$ with independent and $B(Y(t), d\hat A(t))$-distributed increments at the jump times of $\hat A$. This results in a so-called weird bootstrap NAE version $V_n^* = \sqrt{n}(\hat A^* - \hat A) = \sqrt{n} \int 1/Y \, (dN^* - dN)$. At first sight, this bootstrap is 'weird' in that the number at risk is not changed in the bootstrap step and thus each individual may cause several simulated events. At second sight, however, the weird bootstrap is a very natural approach, as discussed in Section 3 below. Andersen et al. sketched a theoretical justification for weird bootstrapping the NAE, but, as Freitag (2000, p. 38) also pointed out, a rigorous proof has not been given.
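The weird bootstrap step described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `weird_nae_process` and its arguments are hypothetical, the data are summarized by the numbers at risk and the event counts at the jump times, and the binomial draws are generated by summing Bernoulli trials rather than by a dedicated sampler.

```python
import math
import random


def weird_nae_process(at_risk, d_events, n, rng):
    """Return one weird bootstrap path sqrt(n) * (A*_hat - A_hat) at the jump times.

    at_risk[k]  : number at risk Y(t_k) just before the k-th jump time
    d_events[k] : observed number of events dN(t_k) at that time
    n           : sample size used for the sqrt(n) standardization
    """
    path, running = [], 0.0
    for y, d in zip(at_risk, d_events):
        p = d / y  # Nelson-Aalen increment dA_hat(t_k)
        # dN*(t_k) ~ Binomial(Y(t_k), dA_hat(t_k)); the risk set is not resampled
        d_star = sum(1 for _ in range(y) if rng.random() < p)
        running += (d_star - d) / y  # increment of the integral of (dN* - dN) / Y
        path.append(math.sqrt(n) * running)
    return path


if __name__ == "__main__":
    rng = random.Random(2015)
    print(weird_nae_process([5, 4, 3, 2], [1, 1, 1, 1], 5, rng))
```

Repeating the call many times with the data held fixed gives draws from the conditional distribution of the bootstrapped process; note that the risk set `at_risk` is never changed between replicates, which is exactly the 'weird' feature mentioned in the text.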


Although the weird bootstrap has been implemented in the functions censboot and coxreg of the R packages boot and eha, respectively (for the latter, see Appendix D.2 of Broström 2012), Efron's bootstrap and the wild bootstrap with standard normal weights are the most popular resampling schemes in the survival literature. Exceptions using the weird bootstrap are Dudek et al. (2008) and Fledelius et al. (2004). Dudek et al. empirically found superiority of some weird bootstrap confidence bands for the cumulative hazard rate compared to using Efron's approach. Fledelius et al. studied residual lifetimes and proposed the weird bootstrap for a kernel density estimator of the hazard rate. These authors accounted for both the age of individuals under study and calendar time, leading to a two-dimensional time parameter. Weak convergence is shown for arbitrary single points of time, but not time-simultaneously, yielding confidence intervals rather than confidence bands. For another brief textbook treatment of the weird bootstrap, see also Davison and Hinkley (1997, Sections 3.5 and 7.3).

The aim of this paper is to introduce and rigorously justify a new resampling procedure, the data-dependent multiplier bootstrap (DDMB), for non-parametric analysis of survival data, possibly subject to competing risks, that includes both the general wild bootstrap and the weird bootstrap as special cases. The data are assumed to be subject to independent right-censoring and left-truncation, but a strict i.i.d. setup is not required. (In fact, all that is really needed is the multiplicative intensity model; see, e.g., Aalen et al. (2008, Section 3.1.2).) As a byproduct, our development includes a rigorous proof for the original weird bootstrap. In contrast to the classical wild bootstrap, the new procedure allows for both non-i.i.d. weights and data-dependent weights. Expressing the weird bootstrap as a DDMB, the corresponding multipliers approximately correspond to independent Poi(1) variates for large numbers of individuals at risk, arriving at a special wild bootstrap version as studied in Beyersmann et al. (2013).

For ease of presentation, we formulate our developments for the AJE of the cumulative incidence functions (CIFs) in a competing risks setting. This includes the standard survival scenario in which there is only one event type. For applications of the DDMB, we study the testing problem of equality of CIFs from two independent groups (see also, e.g., Bajorunaite and Klein 2007, 2008; Dobler and Pauly 2014, 2015), and we construct asymptotically valid confidence bands for CIFs (see also, e.g., Lin 1997; Beyersmann et al. 2013).

This article is organized as follows. Section 2 recaps the properties of the competing risks model under consideration, the quantity of interest (the CIF) and its canonical estimators. The DDMB and its special forms are introduced and analyzed in Section 3, and applications for the two-sample testing problem as well as for time-simultaneous confidence bands are given in Section 4. Small sample performance of confidence bands is assessed in a simulation study in Section 5. The simulation setup has been chosen similar to a randomized clinical trial on cardiovascular events in diabetes patients (Wanner et al. 2005), and real data from this trial are then analyzed in Section 6. Finally, we give some concluding remarks in Section 7. All proofs are deferred to the Appendix.


2 Notation, Model and Estimators 


The ordinary survival setup is generalized to a competing risks process $(X_t)_{t \ge 0}$ with $m \in \mathbb{N}$ competing risks. This is a non-homogeneous Markov process with state space $\{0, 1, \dots, m\}$ and initial state 0, i.e., $P(X_0 = 0) = 1$. All other states $1, \dots, m$ represent absorbing competing risks. For ease of notation we only discuss the case $m = 2$ with two absorbing states, since generalizations to $m \ge 3$ are obvious. The event time $T = \inf\{t > 0 \mid X_t \ne 0\}$ is assumed to be finite a.s. The process behaviour is regulated by the transition intensities (or cause-specific hazard functions) between states 0 and $j = 1, 2$, denoted by

$$\alpha_j(t) = \lim_{\Delta t \downarrow 0} \frac{P(T \in [t, t + \Delta t), X_T = j \mid T \ge t)}{\Delta t}, \quad j = 1, 2. \tag{2.1}$$

Throughout we assume that $\alpha_1$ and $\alpha_2$ exist. One is often interested in the development of the competing risks process in time on a given compact interval $I \subset [0, \tau)$. Here $\tau$ is an arbitrary terminal time such that $\tau < \sup\{u : \int_0^u (\alpha_1(s) + \alpha_2(s))\,ds < \infty\}$, whence $P(T > \cdot) > 0$ on $[0, \tau)$. For a detailed motivation and more practical examples of competing risks designs we refer to Andersen et al. (1993), Allignol et al. (2010) as well as Beyersmann et al. (2012).


For $n$ independent replicates of the competing risks process, i.e. $n$ individuals under study, we now consider the associated bivariate counting process $N = (N_1, N_2)$. Here $N_j = \sum_{i=1}^n N_{j;i}$, $j = 1, 2$, with

$$N_{j;i}(t) = 1(\text{Subject } i \text{ has an observed } (0 \to j)\text{-transition in } [0, t]) \tag{2.2}$$

counts the number of observed transitions into state $j \in \{1, 2\}$, where $1(\cdot)$ denotes the indicator function. As usual, it is postulated that the processes $N_1$ and $N_2$ are càdlàg and do not jump simultaneously. Moreover, we assume that $N$ fulfills the multiplicative intensity model given in Andersen et al. (1993), i.e., its intensity process $\lambda = (\lambda_1, \lambda_2)$ is given by

$$\lambda_j(t) = Y(t)\alpha_j(t), \quad j = 1, 2. \tag{2.3}$$

Here $Y = \sum_{i=1}^n Y_i$ and

$$Y_i(t) = 1(\text{Subject } i \text{ is in state } 0 \text{ at time } t-), \tag{2.4}$$


i.e. $Y$ counts the number at risk immediately before time $t$. It is worth noting that the multiplicative intensity model holds, for instance, in the context of independent right-censoring or left-truncation; see Chapters III and IV in Andersen et al. (1993). Moreover, even different censoring distributions are possible; see Example IV.1.6 in the same textbook. For the explicit modelling of these incomplete observations in various settings we again refer to the monograph of Andersen et al. (1993). Other kinds of multiplicative intensity models are also conceivable in combination with the present theory, such that the number at risk process $Y$ may be replaced with a more general predictable process depending on the model that describes the data; see the examples in Section 3.1.2 of Aalen et al. (2008).

We are now interested in the cumulative incidence functions (CIFs), or sub-distribution functions, given by

$$F_j(t) = P(T \le t, X_T = j) = \int_0^t P(T > u-)\,\alpha_j(u)\,du, \quad j = 1, 2,$$

which depend only on the cause-specific transition intensities $\alpha_1$ and $\alpha_2$. The corresponding sub-survival functions are denoted $S_j(t) = 1 - F_j(t)$, $j = 1, 2$, and the Aalen-Johansen estimators for the CIFs are

$$\hat F_j(t) = \int_0^t \frac{\hat P(T > u-)}{Y(u)}\, J(u)\,dN_j(u), \quad j = 1, 2. \tag{2.5}$$

Here $J(u) = 1(Y(u) > 0)$ (such that $0/0 := 0$) and $\hat P(T > u)$ denotes the Kaplan-Meier estimator. As above, we denote the estimator for the sub-survival function by $\hat S_j(t) = 1 - \hat F_j(t)$. Note that the usual survival scenario is obtained by letting $\alpha_2 \equiv 0$ (and hence $N_2 \equiv 0$), so that $\hat S_1(t)$ reduces to the Kaplan-Meier estimator.
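To make (2.5) concrete, the following Python sketch computes the Aalen-Johansen estimates for two competing risks. It is a minimal illustration only: the function name `aalen_johansen` and the data layout are hypothetical, and the sketch assumes right-censored data without left-truncation and without ties, with `status` coded as 0 (censored), 1 or 2 (event type).

```python
def aalen_johansen(times, status):
    """Aalen-Johansen estimates of the two CIFs for untied, right-censored data.

    Returns a list of tuples (time, F1_hat, F2_hat, KM_hat), one per observation,
    giving the right-continuous values of the estimators at that time.
    """
    n = len(times)
    order = sorted(range(n), key=lambda i: times[i])
    surv = 1.0          # Kaplan-Meier estimate of P(T > u-) before the current time
    cif = [0.0, 0.0]    # running values of F1_hat and F2_hat
    path = []
    for rank, i in enumerate(order):
        y = n - rank    # number at risk just before times[i]
        if status[i] in (1, 2):
            cif[status[i] - 1] += surv / y   # increment P_hat(T > u-) / Y(u) * dN_j(u)
            surv *= 1.0 - 1.0 / y            # Kaplan-Meier update for any event
        path.append((times[i], cif[0], cif[1], surv))
    return path


if __name__ == "__main__":
    for row in aalen_johansen([1.0, 2.0, 3.0, 4.0], [1, 0, 2, 1]):
        print(row)
```

At every time point the two CIF estimates and the Kaplan-Meier estimate add up to one as long as the last observation is an event; with a censored last observation the remaining mass stays in the Kaplan-Meier term.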

Simultaneous confidence bands for a CIF, say $F_1$, are typically based on the Aalen-Johansen process via

$$W_n(\cdot) = n^{1/2}\{\hat F_1(\cdot) - F_1(\cdot)\}. \tag{2.6}$$

Under the following regularity assumption, which is assumed throughout (where $y : I \to \mathbb{R}$ is a deterministic function),

$$\sup_{u \in I} \left| \frac{Y(u)}{n} - y(u) \right| \xrightarrow{p} 0 \quad \text{with} \quad \inf_{u \in I} y(u) > 0, \tag{2.7}$$

$W_n$ converges in distribution on the Skorohod space $D(I)$ to a zero-mean Gaussian process $U$; see, e.g., Theorem IV.4.2 in Andersen et al. (1993). Here and throughout the paper, "$\xrightarrow{p}$" denotes convergence in probability, whereas "$\xrightarrow{d}$" stands for convergence in distribution as $n \to \infty$. In particular, we have

$$W_n \xrightarrow{d} U \quad \text{on } D(I), \tag{2.8}$$

where $U$ is a zero-mean Gaussian process with covariance function given by

$$\zeta(s_1, s_2) = \int_0^{s_1} \frac{\{S_2(u) - F_1(s_2)\}\{S_2(u) - F_1(s_1)\}\,\alpha_1(u)}{y(u)}\,du + \int_0^{s_1} \frac{\{F_1(u) - F_1(s_2)\}\{F_1(u) - F_1(s_1)\}\,\alpha_2(u)}{y(u)}\,du \tag{2.9}$$


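for $s_1 \le s_2$. This martingale-based weak convergence result follows from the representation

$$W_n(t) = \sqrt{n} \sum_{i=1}^n \left( \int_0^t \frac{S_2(u) - F_1(t)}{Y(u)}\,dM_{1;i}(u) + \int_0^t \frac{F_1(u) - F_1(t)}{Y(u)}\,dM_{2;i}(u) \right) + o_p(1), \tag{2.10}$$

where for $1 \le i \le n$, $j = 1, 2$,

$$M_{j;i}(s) = N_{j;i}(s) - \int_0^s Y_i(u)\,\alpha_j(u)\,du$$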

are square integrable martingales. For ease of notation, the dependency on $n$ and the appearance of the indicator $J(u)$ are suppressed in both integrals in (2.10). The convergence in (2.8) finally follows from (2.7) in combination with Rebolledo's martingale central limit theorem (see Andersen et al. 1993, Theorem II.5.1). Note that the main assumption (2.7) is satisfied in most relevant situations, e.g., for right-censored and left-truncated or even filtered data; see Chapters III and IV in Andersen et al. (1993).

Since the covariance function $\zeta$ is unknown and lacks independent increments, resampling techniques are essential for approximating the distribution $\mathcal{L}(W_n)$ of $W_n$. Therefore, we introduce a general DDMB method.
3 The Data-Dependent Multiplier Bootstrap


The last mentioned covariance problem is typically attacked using a computationally convenient resampling technique which is due to Lin et al. (1993) and Lin (1994, 1997). Their idea is to replace the unobservable martingales $M_{j;i}$ in the representation (2.10) with i.i.d. standard normal variates $G_{j;i}$, $i \in \mathbb{N}$, $j = 1, 2$ (which are independent of the data), times the observable counting processes $N_{j;i}$. Moreover, all remaining unknown quantities in (2.10) are replaced with their estimators. This leads to the following resampling version of $W_n$ according to Lin (1997):

$$\hat W_n(t) = \sqrt{n} \sum_{i=1}^n \left( \int_0^t \frac{G_{1;i}(\hat S_2(u) - \hat F_1(t))}{Y(u)}\,dN_{1;i}(u) + \int_0^t \frac{G_{2;i}(\hat F_1(u) - \hat F_1(t))}{Y(u)}\,dN_{2;i}(u) \right);$$

see also Beyersmann et al. (2013), where the validity of this approach is proven for the even more general wild bootstrap with i.i.d. zero-mean random variables $G_{j;i}$, $i \in \mathbb{N}$, $j = 1, 2$, with variance 1 and finite fourth moments. This means that the conditional distribution of $\hat W_n$ asymptotically coincides with that of $W_n$. Hence its law may be approximated via a large number of realizations, repeatedly generating i.i.d. multipliers $G_{j;i}$. In the following we show how to generalize this method to the case of data-dependent multiplier weights $(D_{2n;i})_{i,n}$ which are only supposed to be conditionally independent given the data. An advantage of this approach is the possibility to weight the individual subjects in diverse ways. For example, certain preferences (e.g. depending on the time under study) can be taken into account, specifically arriving at the weird bootstrap from Andersen et al. (1993); see Examples 1 below. To this end we rewrite $\hat W_n$ as


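$$\hat W_n(t) = \sqrt{n} \sum_{i=1}^n \left( G_{1;i} X_{n;1;i}(t) + G_{2;i} X_{n;2;i}(t) \right) = \sqrt{n} \sum_{i=1}^{2n} G_i Z_{2n;i}(t), \tag{3.1}$$

where for $s \in I$ and $i = 1, \dots, n$

$$X_{n;1;i}(s) = \int_0^s \frac{\hat S_2(u) - \hat F_1(s)}{Y(u)}\,J(u)\,dN_{1;i}(u), \qquad X_{n;2;i}(s) = \int_0^s \frac{\hat F_1(u) - \hat F_1(s)}{Y(u)}\,J(u)\,dN_{2;i}(u),$$

$G_i = G_{1;i} 1(i \le n) + G_{2;i-n} 1(i > n)$ and $Z_{2n;i} := X_{n;1;i} 1(i \le n) + X_{n;2;i-n} 1(i > n)$. That is, we obtain a linear weighted representation as in Dobler and Pauly (2014).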

Now replacing the i.i.d. weights $G_i$ in (3.1) with data-dependent multipliers $(D_{2n;i})_{i,n}$, we arrive at the so-called DDMB version of the normalized Aalen-Johansen estimator

$$\hat W_n^D(t) = \sqrt{n} \sum_{i=1}^{2n} D_{2n;i} Z_{2n;i}(t). \tag{3.2}$$


These bootstrap weights also need to fulfill regularity conditions concerning their conditional moments in order to induce conditional finite-dimensional convergence and tightness. In particular, the Conditions (3.3)-(3.7) below guarantee the validity of this approach, i.e. the weak convergence on the Skorohod space $D(I)$ to the Gaussian process $U$. Its proof depends on an application of Theorem 13.5 of Billingsley (1999) and is split up into two parts; see Lemmas 8.1 and 8.2 in the Appendix.
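The DDMB version of the normalized Aalen-Johansen estimator can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' implementation: the name `ddmb_process` and the callback `draw_multiplier` are hypothetical, the data are right-censored without ties, the estimators inside the integrands are evaluated at the jump times themselves, and a multiplier is drawn only for those summands $Z_{2n;i}$ that are not identically zero (the remaining multipliers do not affect the sum).

```python
import math
import random


def ddmb_process(times, status, grid, draw_multiplier, rng):
    """One realization of sqrt(n) * sum_i D_i * Z_i(t) for t in grid.

    status[i] is 0 (censored), 1 or 2 (event type); draw_multiplier(rng, y)
    returns one multiplier and may depend on the number at risk y (data-dependent).
    """
    n = len(times)
    order = sorted(range(n), key=lambda i: times[i])
    surv, f1, f2 = 1.0, 0.0, 0.0
    events = []  # (time, cause, Y, F1_hat(time), S2_hat(time)) per observed event
    for rank, i in enumerate(order):
        y = n - rank
        if status[i] == 1:
            f1 += surv / y
        elif status[i] == 2:
            f2 += surv / y
        if status[i] > 0:
            surv *= 1.0 - 1.0 / y
            events.append((times[i], status[i], y, f1, 1.0 - f2))
    weights = [draw_multiplier(rng, e[2]) for e in events]
    out = []
    for t in grid:
        f1_t = max((e[3] for e in events if e[0] <= t), default=0.0)  # F1_hat(t)
        total = 0.0
        for (u, cause, y, f1_u, s2_u), d in zip(events, weights):
            if u <= t:
                # integrand of X_{n;1;i} for type 1 events, of X_{n;2;i} for type 2
                integrand = (s2_u - f1_t) if cause == 1 else (f1_u - f1_t)
                total += d * integrand / y
        out.append(math.sqrt(n) * total)
    return out


if __name__ == "__main__":
    rng = random.Random(7)
    normal = lambda r, y: r.gauss(0.0, 1.0)  # Lin-type standard normal multipliers
    print(ddmb_process([1.0, 2.0, 3.0, 4.0], [1, 0, 2, 1], [1.5, 2.5, 4.0], normal, rng))
```

Passing a different `draw_multiplier` switches between resampling schemes without touching the rest of the computation, which mirrors the way (3.2) separates the weights from the data-driven summands.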


For the purpose of applying the theory developed in this paper, we again stress that only the multiplicative intensity model (2.3) and Condition (2.7) are required. Hence, all available information is given by the processes $t \mapsto (Y_i(t), N_{1;i}(t), N_{2;i}(t))$ for all $i = 1, \dots, n$, and the $\sigma$-field containing (at least) all this information is denoted $\mathcal{A}_n$. This scenario includes, for example, independent left-truncation and right-censoring, in which case we can equivalently write $\mathcal{A}_n = \sigma(L_i, T_i, \delta_i : i = 1, \dots, n)$. Here $L_i$ denotes the entry time into the study for individual $i$, $T_i > L_i$ is its event or censoring time, whichever comes first, and $\delta_i$ indicates the type of event in case of $\delta_i \in \{1, 2\}$ or a censored observation for $\delta_i = 0$.

Further, the product measure of (conditional) distributions $P_i$, $i = 1, \dots, n$, is indicated by $\bigotimes_{i=1}^n P_i$, and the notation $V_n \in O_p(r_n)$ describes the following boundedness property in probability: there exists a constant $C > 0$ such that $|V_n| / r_n + o_p(1) \le C$ a.s. for all $n$.
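Theorem 1. Suppose that (2.7) holds and that the DDMB weights $(D_{2n;i})_{i,n}$ fulfill

$$\max_{1 \le i \le 2n} |\mu_{2n;i}| := \max_{1 \le i \le 2n} |E[D_{2n;i} \mid \mathcal{A}_n]| \in O_p(n^{-1/2}), \tag{3.3}$$

$$\max_{1 \le i \le 2n} |\sigma^2_{2n;i} - 1| := \max_{1 \le i \le 2n} |\mathrm{var}(D_{2n;i} \mid \mathcal{A}_n) - 1| \in o_p(1), \tag{3.4}$$

$$\max_{1 \le i \le 2n} E[D^4_{2n;i} \mid \mathcal{A}_n] \in O_p(n), \tag{3.5}$$

$$\mathcal{L}(D_{2n;i}, i = 1, \dots, 2n \mid \mathcal{A}_n) = \bigotimes_{i=1}^{2n} \mathcal{L}(D_{2n;i} \mid \mathcal{A}_n). \tag{3.6}$$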


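If in addition $(D_{2n;i})_{i,n}$ satisfy the following conditional Lindeberg condition in probability given $\mathcal{A}_n$,

$$\sum_{i=1}^{2n} E\left[ \frac{(D_{2n;i} - \mu_{2n;i})^2}{\sum_{j=1}^{2n} \sigma^2_{2n;j}}\, 1\left( \frac{(D_{2n;i} - \mu_{2n;i})^2}{\sum_{j=1}^{2n} \sigma^2_{2n;j}} > \varepsilon \right) \,\Big|\, \mathcal{A}_n \right] \xrightarrow{p} 0 \quad \text{for all } \varepsilon > 0, \tag{3.7}$$

then the DDMB version of the AJE converges in distribution on the Skorohod space $D(I)$ to the Gaussian process $U$ given in (2.8). That is, given $\mathcal{A}_n$ we have

$$\hat W_n^D = \sqrt{n} \sum_{i=1}^{2n} D_{2n;i} Z_{2n;i} \xrightarrow{d} U \quad \text{in probability.}$$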

Remark 1.

(a) The involved Lindeberg condition is implied by the conditions above combined with

$$\max_{1 \le i \le 2n} E[D^4_{2n;i} \mid \mathcal{A}_n] \in O_p(n^{\varepsilon}) \quad \text{for some } \varepsilon \in (0, 1),$$

instead of Condition (3.5), since the multipliers $D_{2n;i}$ then induce a conditional Lyapunov central limit theorem.

(b) In Theorem 1 it is important that the DDMB weights are not influenced by the data in the limit. For example, the asymptotic variances should be 1 regardless of the actual data. In this way DDMB weights and (non-identically distributed) wild bootstrap weights are seen to be equivalent asymptotically.


The conditions of Theorem 1 are satisfied for the following resampling schemes.

Examples 1. (a) The wild bootstrap as in Beyersmann et al. (2013) with i.i.d. multipliers $D_{2n;i} = G_i$ having mean zero, variance 1 and finite fourth moment falls under our approach.

(b) As special cases of (a) we obtain the resampling technique of Lin (1997) with i.i.d. standard normal weights $G_i$ as well as the Poisson-wild bootstrap with data-independent weights $D_{2n;i} \sim Poi(1) - 1$.

(c) Moreover, even a wild bootstrap with non-identically distributed random variates $G_i$, all having mean zero, variance 1 and finite fourth moment, is covered.

(d) Another example is the weird bootstrap of Andersen et al. (1993, Section IV.1.4). For simplicity, we abbreviate $(M_i)_i := (M_{j;i})_{i,j}$ and $(N_i)_i := (N_{j;i})_{i,j}$. Applying the procedure from above, we replace the individual- and transition-specific martingales $M_i$ with $B_i N_i$. Here the random variable $B_i$ is given by

$$B_i \sim B\left(Y(T_i), \frac{1}{Y(T_i)}\right) - 1 \tag{3.8}$$

with $T_i$ as above, and all $B(Y(T_i), 1/Y(T_i))$-distributed random variables are assumed to be independent given the data. Note that the subtraction of 1 in (3.8) corresponds to a centering at $\sum_{i=1}^{2n} Z_{2n;i}$; in Andersen et al. (1993) this is done by subtracting the Nelson-Aalen estimator. The centering by 1 also yields $E[B_i \mid \mathcal{A}_n] = 0$; note here that $Y(T_i) > 0$ for all $i$. Further, the variances are given by $\mathrm{var}(B_i \mid \mathcal{A}_n) = 1 - Y(T_i)^{-1} \xrightarrow{p} 1$, cf. Condition (2.7). This again shows the close connection between the weird and the wild bootstrap (with Poisson weights). However, these binomial variables are in general not independent of the data unconditionally, since the above parameters depend on $Y$ and $T_i$.

(e) Moreover, other data-dependent multipliers that put different weights on observations depending on their time under study are conceivable. A special example is given at the end of the next section.
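The three multiplier types from Examples 1(b) and 1(d) can be generated as in the following Python sketch. The function names are hypothetical; the centered Poisson variates are drawn with Knuth's multiplication method, and the binomial draw behind the weird bootstrap weight is obtained by summing Bernoulli trials, which is simple but not efficient for large risk sets.

```python
import math
import random


def normal_multiplier(rng):
    """Standard normal weight as in Lin's resampling scheme."""
    return rng.gauss(0.0, 1.0)


def poisson_multiplier(rng):
    """Centered Poi(1) weight, i.e. a Poi(1) draw minus 1 (Knuth's method)."""
    limit = math.exp(-1.0)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k - 1.0


def weird_multiplier(rng, y):
    """Weird bootstrap weight B(y, 1/y) - 1 for y >= 1 individuals at risk."""
    return sum(1 for _ in range(y) if rng.random() < 1.0 / y) - 1.0


if __name__ == "__main__":
    rng = random.Random(3)
    draws = [weird_multiplier(rng, 20) for _ in range(10000)]
    mean = sum(draws) / len(draws)
    var = sum((d - mean) ** 2 for d in draws) / len(draws)
    print(mean, var)  # conditional mean 0 and variance 1 - 1/20 in expectation
```

The weird weight has conditional mean 0 and conditional variance $1 - 1/y$, so it approaches the moments of the centered Poisson weight as the number at risk grows, in line with the remark in Example 1(d).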
4 Deduced Inference Procedures


In this section we exemplify some inferential applications of the developed methodology. Throughout, let $I = [t_1, t_2] \subset [0, \tau)$ again be any compact interval.


4.1 Simultaneous Confidence Bands, One-Sample Tests and Confidence Intervals 


Following Lin (1997) and Beyersmann et al. (2013), time-simultaneous confidence bands can be constructed by the functional delta method as follows:

1. We consider the transformed Aalen-Johansen process $\gamma_n(t) = \sqrt{n}\, g(t)\{\phi(\hat F_1(t)) - \phi(F_1(t))\}$

2. with transformation $\phi$ (such as $\phi_1(t) = \log(-\log(1 - t))$),

3. weight function $g$ (such as $g_1(t) = \log(1 - \hat F_1(t)) / \hat\sigma(t)$ or $g_2(t) = \log(1 - \hat F_1(t)) / (1 + \hat\sigma^2(t))$),

4. variance estimator $\hat\sigma^2(t) = n\,\widehat{\mathrm{var}}(\hat F_1(t)) / (1 - \hat F_1(t))^2$,

5. and its corresponding resampling version $\hat\gamma_n(t) = g(t)\,\phi'(\hat F_1(t))\,\hat W_n^D(t)$.


The variance $n\,\widehat{\mathrm{var}}(\hat F_1(t))$ in the DDMB resampling version $\hat\gamma_n$ is similar to the wild bootstrap variance estimator of Dobler and Pauly (2014), where we now use the same DDMB weights as in the DDMB version of $W_n$. Again following Lin (1997) and Beyersmann et al. (2013), we call the bands resulting from $g_1$ and $g_2$ equal precision and Hall-Wellner bands, respectively. Simulating the 95% quantile $q_{0.95}$ of $\sup_{t \in I} |\hat\gamma_n(t)|$ (thereby keeping the data fixed) and using the transformation $\phi_1$, approximate 95% confidence bands for $(F_1(t))_{t \in I}$ are obtained as

$$1 - (1 - \hat F_1(t))^{\exp(\pm q_{0.95} / (\sqrt{n}\, g(t)))}, \quad t \in I. \tag{4.1}$$


Equivalent Kolmogorov-Smirnov-type tests $\varphi$ for the null hypothesis $H : \{F_1 = F \text{ on } I\}$ for a prespecified function $F$ are given as $\varphi = 0$ if and only if $F$ is completely contained in the above confidence band.

Finally, pointwise confidence intervals for the binomial probability $F_1(s) = P(X_s = 1)$ for each $s \in I$ are immediately obtained by letting $t_1 = t_2 = s$ so that $I = \{s\}$.
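The band construction in (4.1) can be illustrated with a short Python sketch. It assumes that the bootstrap suprema of the resampled process have already been simulated; the helper names `empirical_quantile` and `loglog_band` are hypothetical, and the quantile is taken as a simple order statistic of the bootstrap values.

```python
import math


def empirical_quantile(values, level=0.95):
    """Order-statistic quantile of simulated bootstrap suprema."""
    ordered = sorted(values)
    k = max(0, math.ceil(level * len(ordered)) - 1)
    return ordered[k]


def loglog_band(f1_hat, g, q, n):
    """Band limits 1 - (1 - F1_hat)^{exp(+-q / (sqrt(n) * g))} at one time point.

    f1_hat : Aalen-Johansen estimate in (0, 1)
    g      : value of the weight function at that time (non-zero, may be negative)
    q      : simulated quantile of the supremum statistic
    """
    half = q / (math.sqrt(n) * g)
    a = 1.0 - (1.0 - f1_hat) ** math.exp(half)
    b = 1.0 - (1.0 - f1_hat) ** math.exp(-half)
    return min(a, b), max(a, b)  # g < 0 swaps the roles of the two signs


if __name__ == "__main__":
    q = empirical_quantile([0.8, 1.9, 1.2, 2.4, 1.5], level=0.8)
    print(loglog_band(0.3, -2.0, q, 100))
```

Because the weight functions $g_1$ and $g_2$ involve $\log(1 - \hat F_1(t))$, they are negative whenever $0 < \hat F_1(t) < 1$; the sketch therefore sorts the two limits instead of relying on the sign in the exponent.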


4.2 Two-Sample Resampling Tests for Equal CIFs 

Another topic of interest is the comparison of two CIFs for the same risk but from independent sample groups with sample sizes $n_1$ and $n_2$, respectively. For this reason we introduce all quantities of the previous sections sample-specifically and denote them with a superscript $(k)$, $k = 1, 2$. For example, $F_1^{(2)}$ is the second group's CIF for the first risk, $\tau^{(1)}$ is the terminal point for observations in the first group and $D^{(k)}_{2n;i}$ is the DDMB weight for $Z^{(k)}_{2n;i}$, where $n = n_1 + n_2$. Further, we define $\tau = \tau^{(1)} \wedge \tau^{(2)}$ and $\mathcal{A}_n = \sigma(\mathcal{A}^{(1)}_{n_1}, \mathcal{A}^{(2)}_{n_2})$. We would now like to construct non-parametric resampling tests for the hypotheses

$$H_= : \{F_1^{(1)} = F_1^{(2)} \text{ on } [t_1, t_2]\} \quad \text{versus} \quad K_{\ne} : \{F_1^{(1)} \ne F_1^{(2)} \text{ on a subset } A \subset [t_1, t_2] \text{ such that } \lambda(A) > 0\},$$

where $\lambda$ denotes Lebesgue measure. To this end we first introduce the two-sample version of $W_n$ as a scaled difference of Aalen-Johansen estimators, namely

$$W_{n_1,n_2} = \sqrt{\frac{n_1 n_2}{n}}\left(\hat F_1^{(1)} - \hat F_1^{(2)}\right).$$
Based on a similar martingale representation as in Equation (2.10), we arrive at a DDMB version of $W_{n_1,n_2}$,

$$\hat W^D_{n_1,n_2} = \sqrt{\frac{n_1 n_2}{n}} \left( \sum_{i=1}^{2n_1} D^{(1)}_{2n;i} Z^{(1)}_{2n;i} - \sum_{i=1}^{2n_2} D^{(2)}_{2n;i} Z^{(2)}_{2n;i} \right); \tag{4.2}$$

see (3.2) for the corresponding one-sample case. This gives us a generalization of the two-sample wild bootstrap statistic of Dobler and Pauly (2015), where such resampling tests based on i.i.d. multipliers are compared to computationally less expensive approximate tests.

Following the lines of Dobler and Pauly (2015), we now construct several resampling tests for $H_=$ versus $K_{\ne}$. This is accomplished by plugging the statistic $W_{n_1,n_2}$ and its resampled version $\hat W^D_{n_1,n_2}$ into a continuous functional $\psi : D[0, \tau] \to \mathbb{R}$ such that $\psi(W_{n_1,n_2})$ tends to infinity in probability if the alternative hypothesis is true. In this subsection the asymptotic statements refer to $n \to \infty$ and $n_1/n \to \kappa \in (0, 1)$. Since, under $H_=$, $W_{n_1,n_2}$ and $\hat W^D_{n_1,n_2}$ possess the same Gaussian limit distribution, the resulting test depending on $\psi(W_{n_1,n_2})$ (the test statistic) and $\psi(\hat W^D_{n_1,n_2})$ (yielding a data-dependent critical value) is of asymptotic level $\alpha$. Furthermore, the test is consistent, that is, it rejects the null hypothesis with probability tending to 1 as $n \to \infty$ whenever the alternative is true. Thus, the following two theorems follow immediately from the weak convergence results of the preceding theorem for $W_{n_1,n_2}$ and $\hat W^D_{n_1,n_2}$ and from applications of the continuous mapping theorem.


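Theorem 2 (A Kolmogorov-Smirnov-type test). Choose a triangular array of DDMB weights $D^{(k)}_{2n;i}$, $i = 1, \dots, 2n_k$, $k = 1, 2$, satisfying (3.3)-(3.7) and let $w : [t_1, t_2] \to (0, \infty)$ be a bounded weight function. A consistent, asymptotic level $\alpha$ resampling test for $H_=$ vs. $K_{\ne}$ is given by

$$\varphi^{KS} = \begin{cases} 1 & \text{if } \sup_{u \in [t_1, t_2]} w(u)\,|W_{n_1,n_2}(u)| > c^{KS}, \\ 0 & \text{if } \sup_{u \in [t_1, t_2]} w(u)\,|W_{n_1,n_2}(u)| \le c^{KS}, \end{cases}$$

where $c^{KS} = c^{KS}_{n_1,n_2}(\alpha)$ is the $(1 - \alpha)$-quantile of the conditional distribution

$$\mathcal{L}\left( \sup_{u \in [t_1, t_2]} w(u)\,|\hat W^D_{n_1,n_2}(u)| \,\Big|\, \mathcal{A}_n \right).$$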



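Theorem 3 (A Cramér-von Mises-type test). Choose a triangular array of DDMB weights $D^{(k)}_{2n;i}$, $i = 1, \dots, 2n_k$, $k = 1, 2$, satisfying (3.3)-(3.7) and let $w : [t_1, t_2] \to (0, \infty)$ be an integrable weight function. A consistent, asymptotic level $\alpha$ resampling test for $H_=$ vs. $K_{\ne}$ is given by

$$\varphi^{CvM} = \begin{cases} 1 & \text{if } \int_{t_1}^{t_2} w(u)\,W^2_{n_1,n_2}(u)\,du > c^{CvM}, \\ 0 & \text{if } \int_{t_1}^{t_2} w(u)\,W^2_{n_1,n_2}(u)\,du \le c^{CvM}, \end{cases}$$

where $c^{CvM} = c^{CvM}_{n_1,n_2}(\alpha)$ is the $(1 - \alpha)$-quantile of the conditional distribution

$$\mathcal{L}\left( \int_{t_1}^{t_2} w(u)\,(\hat W^D_{n_1,n_2}(u))^2\,du \,\Big|\, \mathcal{A}_n \right).$$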






Remark 2. For given $(D^{(k)}_{2n;i})_{i,k}$ we could also choose the DDMB weights as the slightly modified variables $\tilde D^{(k)}_{2n;i} = (1 + o_p(1))\,D^{(k)}_{2n;i}$ for asymptotically negligible terms $o_p(1)$ which are supposed to be measurable w.r.t. $\mathcal{A}_n$. In the article of Dobler and Pauly (2014) it is seen that wild bootstrap tests may tend to be slightly too liberal for strongly unequal sample sizes or when censoring is present. Therefore, the choice of a suitable deterministic sequence $o_p(1) = o(1)$ depending on the sample sizes, or its square root, leads to slightly more conservative versions of the above tests in case of unequal sample sizes. In order to additionally account for censoring, we could even choose a rather bigger $o_p(1)$ term whose denominator involves the numbers at risk $Y^{(1)}$ and $Y^{(2)}$ at $t_2$ (assuming approximately equal censoring rates in both groups), since the denominator tends to be smaller the more individuals are censored.
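The decision rules of the Kolmogorov-Smirnov-type and Cramér-von Mises-type resampling tests can be sketched generically in Python. The sketch assumes that the observed two-sample process and its DDMB replicates have already been evaluated on a common time grid; all function names are hypothetical, and the integral in the Cramér-von Mises statistic is approximated by a left Riemann sum over that grid.

```python
import math


def ks_statistic(path, weights):
    """Weighted supremum statistic max_k w(u_k) * |W(u_k)| over the grid."""
    return max(w * abs(x) for x, w in zip(path, weights))


def cvm_statistic(path, weights, grid):
    """Left Riemann sum approximating the integral of w(u) * W(u)^2 over the grid."""
    return sum(
        weights[k] * path[k] ** 2 * (grid[k + 1] - grid[k])
        for k in range(len(grid) - 1)
    )


def resampling_test(observed_stat, bootstrap_stats, alpha=0.05):
    """Return (decision, critical value); decision 1 means the null is rejected."""
    ordered = sorted(bootstrap_stats)
    k = max(0, math.ceil((1.0 - alpha) * len(ordered)) - 1)
    critical = ordered[k]  # conditional (1 - alpha)-quantile given the data
    return (1 if observed_stat > critical else 0), critical


if __name__ == "__main__":
    grid = [0.5, 1.0, 2.0, 5.0]
    w = [1.0, 1.0, 1.0, 1.0]
    observed = ks_statistic([0.2, -1.7, 0.9, 0.4], w)
    replicates = [ks_statistic(p, w) for p in ([0.1, 0.3, -0.2, 0.0], [0.9, -0.4, 0.5, 0.1])]
    print(resampling_test(observed, replicates, alpha=0.5))
```

The same `resampling_test` helper serves both statistics, since only the functional applied to the observed and resampled processes differs between the two theorems.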
5 Simulations


The aim of the present simulation study is to assess the coverage probabilities of confidence bands for the first CIF in a situation similar to the real data example which is introduced and analyzed in Section 6. To this end, ties in the original data set have been broken. Data have been simulated from smoothed versions of the non-parametric estimators; see Allignol et al. (2011) for a similar approach. Table 1 reports comparable percentages of type 1 and type 2 events as well as of censorings for both the original data set and 50,000 simulated individuals.



                 data set   simulations
type 1 events    38.21      38.68
type 2 events    20.28      20.06
censorings       41.51      41.26

Table 1: Percentages of types of observations.


The simulations were conducted using the R computing environment, version 3.1.3 (R Development Core Team, 2015), each with $N_{sim} = 10{,}000$ simulation runs for simulations with up to 100 individuals under study. For larger groups of individuals, we have chosen $N_{sim} = 1000$ simulation runs due to the strongly increasing computational effort. For determination of the random quantile $q_{0.95}$ we have run $B = 999$ bootstrap runs in each simulation step. We constructed both Hall-Wellner and equal precision bands on the time interval [0.5, 5], each based on either standard normal, centered Poi(1) or weird bootstrap weights within the DDMB approach. Table 2 gives the resulting coverage probability estimates for $n \in \{50, 60, \dots, 100, 200, 300, 636\}$ simulated individuals under study in each simulation run, where $n = 636$ is the sample size of the data example studied in Section 6.

All coverage probabilities in Table 2 are too small for sample sizes $n \le 200$, but with a tendency towards better coverage probabilities for Poisson multipliers and the weird bootstrap. For these two resampling procedures, there is also a preference for equal precision bands. This is also the scenario which comes close to the nominal level for $n = 300$, while standard normal multipliers lead to a coverage probability of less than 91% for both types of bands. All bootstrap variants approach the nominal level for $n = 636$. Finally, Figure 1 in the subsequent section shows an empirical probability of $51/636 \approx 8.0\%$ for being at risk at $t = 5-$, which reinforces the impression that the construction of bands was an ambitious aim for sample sizes of $n \le 100$.


n      normal   Poisson   weird
50     79.4     80.77     79.84
60     82.45    82.68     82.67
70     84.86    85.59     85.44
80     86.2     86.74     86.86
90     87.9     88.21     88.49
100    88.22    89.07     89.50
200    89.9     91.5      92.1
300    90.9     93.6      93.1
636    94.8     94.1      94.9

(a) Hall-Wellner bands

n      normal   Poisson   weird
50     76.49    80.11     79.72
60     80.44    84.22     83.43
70     82.89    86.34     86.49
80     85.36    87.93     88.22
90     86.05    89.67     89.38
100    87.68    90.55     91.06
200    91.1     93.3      93.9
300    90.6     95.2      95.1
636    94.1     95.7      94.4

(b) Equal precision bands

Table 2: Per cent coverage probabilities of confidence bands.
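When reading the coverage estimates, it may help to keep the Monte Carlo error in mind. The following Python sketch uses the usual binomial standard error for an estimated proportion; the function name `mc_standard_error` is hypothetical, and the calculation assumes independent simulation runs.

```python
import math


def mc_standard_error(coverage, n_sim):
    """Binomial standard error of a coverage estimate based on n_sim independent runs."""
    return math.sqrt(coverage * (1.0 - coverage) / n_sim)


if __name__ == "__main__":
    for n_sim in (10000, 1000):
        print(n_sim, round(100 * mc_standard_error(0.95, n_sim), 2), "percentage points")
```

At a true coverage of 95%, this gives a standard error of roughly 0.2 percentage points for 10,000 runs and roughly 0.7 percentage points for 1,000 runs, so small differences between entries for the larger sample sizes should be interpreted with some care.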
6 A Real Data Example


We consider data from the 4D study (Wanner et al. 2005), which was a prospective randomized controlled trial evaluating the effect of lipid lowering with atorvastatin in diabetic patients receiving hemodialysis. The primary outcome was a composite of death from cardiac causes, stroke, and non-fatal myocardial infarction, subject to the competing risk of death from other causes. The motivation of the trial was that statins are protective with respect to cardiovascular events for persons with type 2 diabetes mellitus without kidney disease, but a possible benefit in patients receiving hemodialysis had until then not been assessed. Schulgen et al. (2005) have discussed sample size planning with competing risks outcomes for the 4D study, and Allignol et al. (2011) have used the 4D study to advocate a simulation point of view for the interpretation of competing risks.

Wanner et al. (2005) found a non-significant protective effect of atorvastatin on the cause-specific hazard of the primary outcome (hazard ratio 0.92 with 95% confidence interval [0.77, 1.10]). There was essentially no difference between groups for the competing cause-specific hazard, implying similar CIFs in the groups; see Allignol et al. (2011) for an in-depth discussion. Hence, we restrict ourselves in this section to a one-sample scenario and re-analyze the control group data (636 patients). The data have been made available in the R package etm (Beyersmann et al. 2012), and our results may therefore be checked for reproducibility. Ties have been broken as in Section 5.


Figure 1 shows Hall-Wellner (left panel) and equal precision (right panel) bands for the CIF of the primary outcome, using the weird bootstrap and the wild bootstrap with both standard normal and centered Poi(1) weights. Within each panel, differences between the bands are invisible to the naked eye. Table 3 additionally shows the areas between the upper and lower boundaries of the confidence bands; differences are again negligible.


The only notable difference is the form of both types of bands: while the Hall-Wellner bands' boundaries seem to have almost the same distance at all points of time, the equal precision bands start with a narrower band at $t = 0.5$, which clearly becomes wider as time progresses. Overall, however, the areas of both types of bands are again comparable.

Figure 1 additionally shows pointwise, log-log-transformed confidence intervals. As expected, the pointwise
intervals are narrower than the simultaneous bands, but the bands do perform competitively.



             Hall-Wellner    Equal precision
normal          .4655            .4621
Poisson         .4783            .4770
weird           .4764            .4746

Table 3: Areas covered by the confidence bands in Figure 1 for different resampling schemes.
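
The areas in Table 3 are areas between two step functions. A minimal sketch of such a computation is given below, assuming right-continuous band boundaries that are piecewise constant between grid points; the function name `band_area` and the toy numbers are hypothetical and are not taken from the 4D data or from the etm package.

```python
import numpy as np

def band_area(times, lower, upper, t_end):
    """Area between two right-continuous step functions on [times[0], t_end].

    times : increasing grid points at which the boundaries may jump
    lower, upper : boundary values holding on [times[k], times[k+1])
    """
    times = np.asarray(times, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    # Length of each interval on which both boundaries are constant.
    widths = np.diff(np.append(times, t_end))
    return float(np.sum((upper - lower) * widths))

# Toy band boundaries (hypothetical numbers for illustration only).
times = [0.5, 1.0, 2.0]
lower = [0.00, 0.05, 0.10]
upper = [0.10, 0.20, 0.30]
area = band_area(times, lower, upper, t_end=3.0)
```

Applied to the boundaries of two competing bands on the same time interval, this gives a single number per band that can be compared across resampling schemes.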


We also performed analogous analyses in data subsamples with 200 and 300 individuals. In line with our
simulation results, the wild bootstrap with standard normal multipliers produced narrower bands, but, similar
to the complete cohort, the differences between the different bands were of little practical importance in this
example. In the analyses of the subsamples, the bands again performed competitively when compared to pointwise
confidence intervals. (Results not shown.)


Figure 1: Approximate 95% confidence bands for $F_1$ using different DDMB weights: standard normal (—),
centered Poi(1) (-), weird bootstrap (• — •) multipliers. The solid line in the middle is the corresponding
Aalen-Johansen estimator. Pointwise 95% confidence intervals (••••) for $F_1$, also based on a log-log transformation
and plotted in dark grey, have been calculated using the R-package etm. Above the plots, the number of individuals
at risk shortly before each half-year is indicated.


7 Discussion and Outlook

We have introduced and rigorously justified the new data-dependent multiplier bootstrap for non-parametric
analysis of survival data. Observation may be restricted by independent right-censoring and left-truncation, but a strict
i.i.d. setup is not required. Our developments have included the case where failure may be due to several competing
risks, where resampling is particularly attractive due to the lack of asymptotic pivotal approximations. Our general
framework includes both the wild bootstrap and the weird bootstrap as special cases. The wild bootstrap with
standard normal multipliers is a popular and computationally convenient technique (e.g., Martinussen and Scheike
2006). The weird bootstrap, introduced by Andersen et al. (1993) in their essential book on Statistical Models
Based on Counting Processes, appears to be rarely used, if at all, although it has been implemented in software. To 
the best of our knowledge, our paper is the first to rigorously show asymptotic correctness of the weird bootstrap 
in the present context. The variety of available resampling techniques raises the question of which bootstrap to 
use. Efron’s original proposal of repeatedly taking random samples with replacement from the randomly censored 
observations (Efron 1981) is arguably closest to his original approach (Efron 1979), but does rely on a strict i.i.d.
setup; see also the discussion in Andersen et al. (1993, Section IV.1.4). The wild bootstrap with standard normal
multipliers is motivated by the martingale representations used in the proofs of weak convergence of the original
estimators. In a nutshell, the idea is to replace asymptotic normality by finite sample normality (because of normal
multipliers, keeping the data fixed) with approximately the right covariance. The general wild bootstrap allows for
non-normal multipliers, replacing finite sample normality by approximate normality. But the weird bootstrap is
perhaps the most natural resampling scheme for survival data. To see this, recall that one major reason for basing 
survival analysis on hazards is censoring. In our setting, and assuming for the time being independent random 
censorship by C, we have that 


$$\alpha_j(t)\,dt = P(T \in [t, t+dt),\, X_T = j \mid T \ge t) = P(T \in [t, t+dt),\, X_T = j,\, T \le C \mid T \ge t,\, C \ge t),$$

where the first equality is the definition from Equation (2.1) and the second equality follows because of random
censoring. Independent censoring now essentially requires the last equation (reformulated using counting
processes and at-risk processes) to hold rather than the existence of a latent censoring time, which is assumed to
be stochastically independent of $(T, X_T)$. It is the second equality that, first of all, motivates the increments of
the cause-specific NAE, say $d\hat{A}_j(t) = dN_j(t)/Y(t)$. The weird bootstrap continues from this point by sampling
$B(Y(t), d\hat{A}_j(t))$-distributed increments at the jump times of $N$. The fact that sampling is performed independently
at the jump times is justified by the asymptotic distribution of $\sqrt{n}(\hat{A}_j - A_j)$ having independent increments.
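
The sampling step just described can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes that the binomial counts drawn at each jump time are divided by the number at risk to give the resampled Nelson-Aalen increments, and the function names and toy data are hypothetical.

```python
import numpy as np

def nelson_aalen_increments(d_n, y):
    """Increments dN_j(t) / Y(t) of the cause-specific Nelson-Aalen estimator."""
    return np.asarray(d_n, dtype=float) / np.asarray(y, dtype=float)

def weird_bootstrap_increments(d_n, y, rng):
    """Resampled increments: B(Y(t), dN_j(t)/Y(t)) counts, drawn independently
    at each jump time and divided by Y(t) (scaling is an assumption here)."""
    d_n = np.asarray(d_n)
    y = np.asarray(y)
    counts = rng.binomial(n=y, p=d_n / y)
    return counts / y

rng = np.random.default_rng(1)
y = np.array([10, 9, 7, 4])    # numbers at risk just before each jump time
d_n = np.array([1, 1, 1, 1])   # cause-j events at those times (no ties)

a_hat = np.cumsum(nelson_aalen_increments(d_n, y))           # original estimator
a_star = np.cumsum(weird_bootstrap_increments(d_n, y, rng))  # one resampled path
```

Because each count has conditional mean $dN_j(t)$, the resampled path is centered at the original estimator given the data, and repeating the last line many times yields the resampling distribution.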

Our simulation results have shown that one should keep alternatives to the wild bootstrap with the almost
exclusively used standard normal multipliers in mind. In the scenarios that we have considered, we found a
preference for Poisson multipliers and for the weird bootstrap. Beyersmann et al. (2013), who only considered
the wild bootstrap, also found a preference for Poisson multipliers, but the differences in the present paper were
more pronounced. We did not find noticeable differences between the approaches in the real data example, but 
our analysis illustrated that simultaneous confidence bands may perform competitively when compared to only 
pointwise confidence intervals. Such bands should be reported more often, because subject matter interest often 
does lie in survival curves rather than probabilities at fixed time points. 
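
For comparison with the weird bootstrap, the wild bootstrap with standard normal or centered Poi(1) multipliers can be sketched as follows. This is a simplified illustration under stated assumptions: the data (events and numbers at risk) are kept fixed, one multiplier is drawn per jump time, and the function names and toy data are hypothetical rather than taken from the paper or from any package.

```python
import numpy as np

def draw_multipliers(kind, size, rng):
    """Zero-mean, unit-variance multipliers."""
    if kind == "normal":
        return rng.standard_normal(size)
    if kind == "poisson":
        return rng.poisson(1.0, size) - 1.0   # centered Poi(1)
    raise ValueError(f"unknown multiplier type: {kind}")

def wild_bootstrap_paths(d_n, y, kind, n_rep, rng):
    """Paths t -> sum over jump times T_i <= t of G_i * dN_j(T_i) / Y(T_i)."""
    incr = np.asarray(d_n, dtype=float) / np.asarray(y, dtype=float)
    g = draw_multipliers(kind, (n_rep, incr.size), rng)
    return np.cumsum(g * incr, axis=1)

rng = np.random.default_rng(0)
y = np.array([10, 9, 7, 4])    # numbers at risk just before each jump time
d_n = np.array([1, 1, 1, 1])   # cause-j events at those times (no ties)

# Conditional variance of the resampled process at the last jump time.
target_var = np.sum((d_n / y) ** 2)
paths_normal = wild_bootstrap_paths(d_n, y, "normal", 50_000, rng)
paths_poisson = wild_bootstrap_paths(d_n, y, "poisson", 50_000, rng)
```

Both multiplier types give the same conditional mean and variance; they differ in higher moments, since the centered Poi(1) multipliers are skewed while the normal ones are symmetric.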


We are currently investigating extensions of the new DDMB approach to multi-state and regression models;
see, e.g., Lin et al. (2000) or Scheike and Zhang (2003) for a normal multiplier application. In particular, the weird
bootstrap naturally extends to these situations.
8 Appendix


The conditional convergence of the finite-dimensional marginal distributions of a linear, resampled process statistic
with DDMB weights can be concluded with the following lemma, which generalizes Theorem A.1 in Beyersmann
et al. (2013). To this end, let $\|\cdot\|$ be a norm on $\mathbb{R}^d$, $d \in \mathbb{N}$, and define $S_n = \sum_{i=1}^n D_{n;i}\,\xi_{n;i}$. In the following, let
$\mathcal{C}_n$ be a $\sigma$-field which contains $\sigma(\xi_{n;i} : i = 1, \dots, n)$. The arrays $D_{n;i}$ and $\xi_{n;i}$ are specified in the following lemma.

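Lemma 8.1. Let the triangular array of random variables $D_{n;i} : \Omega \to \mathbb{R}$ with finite second moments
and the triangular array of $\mathbb{R}^d$-valued random vectors $\xi_{n;i}$ fulfill the following six conditions:

$$\sum_{i=1}^n \xi_{n;i}\,\xi_{n;i}^{T} \xrightarrow{p} \Gamma, \text{ where } \Gamma \text{ is a positive definite covariance matrix,} \tag{8.1}$$

$$\max_{i=1,\dots,n} \|\xi_{n;i}\| \xrightarrow{p} 0, \tag{8.2}$$

$$\sqrt{n} \max_{i=1,\dots,n} |\mu_{n;i}| = \sqrt{n} \max_{i=1,\dots,n} |E[D_{n;i} \mid \mathcal{C}_n]| \xrightarrow{p} 0, \tag{8.3}$$

$$\max_{i=1,\dots,n} |\sigma_{n;i}^2 - 1| = \max_{i=1,\dots,n} |\mathrm{var}(D_{n;i} \mid \mathcal{C}_n) - 1| \xrightarrow{p} 0 \text{ as } n \to \infty, \tag{8.4}$$

$$\mathcal{L}(D_{n;i},\, i = 1, \dots, n \mid \mathcal{C}_n) = \bigotimes_{i=1}^n \mathcal{L}(D_{n;i} \mid \mathcal{C}_n). \tag{8.5}$$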




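In addition, the weights may satisfy the Lindeberg condition in probability given $\mathcal{C}_n$, that is,

$$\sum_{i=1}^n E\left[ \frac{(D_{n;i} - \mu_{n;i})^2}{\sum_{j=1}^n \sigma_{n;j}^2}\, 1\left\{ \frac{(D_{n;i} - \mu_{n;i})^2}{\sum_{j=1}^n \sigma_{n;j}^2} > \varepsilon \right\} \,\Big|\, \mathcal{C}_n \right] \xrightarrow{p} 0 \text{ for all } \varepsilon > 0. \tag{8.6}$$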


Then the conditional weak convergence $S_n \xrightarrow{d} N(0, \Gamma)$ given $\mathcal{C}_n$ holds in probability.
Proof. Following the proof in Beyersmann et al. (2013), we only need to show that $S_n$ satisfies the conditional
Lindeberg condition for dimension $d = 1$. The case of general $d \in \mathbb{N}$ follows from a modified Cramér-Wold
theorem; see Pauly (2011, Theorem 4.1) for details. Thus, we calculate


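$$\mathrm{var}(S_n \mid \mathcal{C}_n) = \sum_{i=1}^n \mathrm{var}(D_{n;i}\,\xi_{n;i} \mid \mathcal{C}_n) = \sum_{i=1}^n \sigma_{n;i}^2\,\xi_{n;i}^2 \xrightarrow{p} \Gamma > 0 \tag{8.7}$$

by (8.1) and (8.4). Further, we write $S_n = \sum_{i=1}^n (D_{n;i} - \mu_{n;i})\,\xi_{n;i} + \sum_{i=1}^n \mu_{n;i}\,\xi_{n;i}$, of which the second sum
is asymptotically negligible by the Cauchy-Schwarz inequality, Conditions (8.1) and (8.3), and Slutsky's theorem:

$$\Big| \sum_{i=1}^n \mu_{n;i}\,\xi_{n;i} \Big| \le \max_{i=1,\dots,n} |\mu_{n;i}| \sum_{i=1}^n |\xi_{n;i}| \le \sqrt{n} \max_{i=1,\dots,n} |\mu_{n;i}| \Big( \sum_{i=1}^n \xi_{n;i}^2 \Big)^{1/2} \xrightarrow{p} 0.$$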


It remains to verify the conditional Lindeberg condition for in probability where we let = 0 

without loss of generality. For this last step we need that X]r=i(^n i ~ which can be easily shown 

using Condition ( |8.4| i and the convergence in ( |8.7| i. Thus, it follows that for all d > 0, 

n n 

p( _ 5 + r ^ < <5 + r^ ^ i. 

i=i 

Now, for all d, e, p > 0 sufficiently small and n sufficiently large we have 


(T^)-^Y.^[D%,en■MDn■4n■.^)^ > T^e)\Cn] 

2 = 1 

n _ n n 




II' _ II' 




i=i 

i=i 


r 


+ Op(l) 


.( 1 ) 


which is Op (1) by ( [831 i. Therefore, satisfies the Lindeberg condition given C„ in probability. 


□ 


Remark 3. See Beyersmann et al. (2013) to note that Conditions (8.1) and (8.2) are fulfilled for the triangular
array {'^j))j,k and Cn = where, for each j = 1,... ,£,k = 1, 2, the vector (tf) = 

{Z^..y(tj ),..., (^j)) consists of the integrals w.r.t. counting processes given by ( |3.11 l and ( |4.21 i evaluated 

at arbitrary times ti,... ,1^ G I. Moreover, this choice for also fulfills the conditions of LemmaW^below. 


Let us now give a criterion for the tightness of linear, resampled process statistics in terms of the DDMB
weights $(D_{n;i})_{i=1,\dots,n}$ and the data vectors $(\xi_{n;i})_{i=1,\dots,n}$. Since tightness of a family of multivariate processes is
equivalent to tightness in each dimension, we here only consider the case of $d = 1$. Recall the $O_p$-notation
introduced above Theorem 1.

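Lemma 8.2. Let each $\xi_{n;i} : \Omega \times I \to \mathbb{R}$, $i = 1, \dots, n$, be a stochastic process and suppose that, as $n \to \infty$,

$$\max_{i=1,\dots,n} |E[D_{n;i} \mid \mathcal{C}_n]| \in O_p(n^{-1/2}), \tag{8.8}$$

$$\max_{i=1,\dots,n} E[D_{n;i}^2 \mid \mathcal{C}_n] \in O_p(1), \tag{8.9}$$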

max \E[D^.^\Cn]\ G Opiy/njrn), 

;=!,...,n ’ 

(8.10) 


( 8 . 11 ) 


max E[DljCn]&Op{rJ), 

C{Dn;i,i = l,...,n\Cn) = C{Dn.^i\Cn), ( 8 . 12 ) 

i— 

n 

< Hr,{s) - H„{r) ^ H{s) - H{r), 0<r<s,r,sel, (8.13) 

i=l 

where H, _ff„ : fl x / —> [0, oo) are nondecreasing functions of which H is continuous and deterministic and where 
Tn = maxg^tg/ (g ))2 € (0,1). Then the family of probability measures C[S^\Cn) is 

tight in probability. 


Proof. By its analogy to the proof of tightness for the exchangeably weighted bootstrapped Aalen-Johansen
process in Dobler and Pauly (2014), where the moment conditions for the (mixed) moments are now replaced by (8.8)
to (8.11), we only need to consider the asymptotics of the involved moments therein; see the proof of their
Theorem 3.1. In fact, moving on to the conditional expectations essentially does not affect the arguments of the referred
proof. It is sufficient to verify that the existing proof holds with these modifications.

Note that we here analyze the conditional moments of $D_{n;i}$ without previously centering the DDMB weights at
their arithmetic mean, which had been necessary in the article by Dobler and Pauly (2014).

Two of the five cases emerging in the referred proof require a separate consideration since our Lemma 8.2 is
formulated in greater generality. Therefore, we begin by noting that, in the first sum on the right-hand side of (A.3)
in Dobler and Pauly (2014), where $E[D_{n;i}^4 \mid \mathcal{C}_n]$ occurs, we also have factors like




i=i 


i=i 


i=l 


This is why (8.11) is sufficient for having reasonable upper bounds of this first sum. A similar argument is required
for those sums where third moments occur, i.e.,

n 

- in-As)){in-As) “ Cn;i(r))^ 


= E 


- in-M) 






1/2 


1/2 


1 = 1 




Hence, Conditions (8.8) and (8.10) are sufficient for bounds of these sums. It remains to inspect

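$$\max E[D_{n;i}^2 D_{n;j}^2 \mid \mathcal{C}_n] \le \max_{i=1,\dots,n} E[D_{n;i}^2 \mid \mathcal{C}_n]^2 \in O_p(1),$$

$$n \max |E[D_{n;i}^2 D_{n;j} D_{n;k} \mid \mathcal{C}_n]| \le \max_{i,j=1,\dots,n} E[D_{n;i}^2 \mid \mathcal{C}_n]\; n\, E[D_{n;j} \mid \mathcal{C}_n]^2 \in O_p(1),$$

$$\max |E[D_{n;i} D_{n;j} D_{n;k} D_{n;l} \mid \mathcal{C}_n]| \le \max_{i=1,\dots,n} \big( \sqrt{n}\, E[D_{n;i} \mid \mathcal{C}_n] \big)^4 \in O_p(1),$$

where the maxima on the left-hand sides are taken over pairwise different indices.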
It is also worth mentioning that a modified version of Billingsley (1999), Theorem 13.5, is in fact applied here, such
that the non-decreasing function therein may be replaced with a sequence of non-decreasing functions converging
pointwise to a continuous one; see the remark in Jacod and Shiryaev (2003), p. 356. Since we are considering
conditional expectations, this condition was translated into the convergence in probability in (8.13) by applying
the subsequence principle. □


Proof of Theorem 1. The result follows from Lemmas 8.1 and 8.2, taking into account Remark 3 and the calculations
in the proof of Theorem 2 in Beyersmann et al. (2013) to see that $n r_n \in O_p(1)$. Also, note that the condition


max \E[Dl i\An]\&Op{n) 

l<i<2n 


is already fulfilled by (3.5) in combination with Jensen's inequality applied with $g : x \mapsto x^{4/3}$. □


Proof of Example^ Only (d) needs to be proven. The other examples are obviously special cases of the proposed 
DDMB of Theorem 1. For the weird bootstrap, the limits of conditional mean and variance are given as


v^|E[s,|^„]| = V^(l-y(T,) 
and \y?tx{B^\An) - 1| = 


= 0 


Y{Tf} 


- 1 


sG[0,t] Pis) 


and the convergence is due to Condition (2.7). Obviously, the Lyapunov condition in Remark 1(a) holds too, and
(3.6) holds by definition of the $B_i$. Thus, we have shown that (3.3) to (3.7) are fulfilled. □
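
As a hedged illustration of this step, assume that there are no ties and that the weird bootstrap weight at the event time $T_i$ is a centered binomial count, $B_i = \tilde{B}_i - 1$ with $\tilde{B}_i \mid \mathcal{A}_n \sim \mathrm{Bin}(Y(T_i), 1/Y(T_i))$. The symbol $\tilde{B}_i$ is introduced here only for illustration and does not appear in the main text. Under that assumption the two conditional moments work out as follows.

```latex
\begin{align*}
E[B_i \mid \mathcal{A}_n]
  &= Y(T_i) \cdot \frac{1}{Y(T_i)} - 1 = 0, \\
\mathrm{var}(B_i \mid \mathcal{A}_n)
  &= Y(T_i) \cdot \frac{1}{Y(T_i)} \Bigl( 1 - \frac{1}{Y(T_i)} \Bigr)
   = 1 - \frac{1}{Y(T_i)}, \\
\bigl| \mathrm{var}(B_i \mid \mathcal{A}_n) - 1 \bigr|
  &= \frac{1}{Y(T_i)}.
\end{align*}
```

Under this reading, the mean condition holds exactly, and the variance condition holds whenever the number at risk at the considered event times grows without bound.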